-
Notifications
You must be signed in to change notification settings - Fork 586
RRD footers 3: encoding/decoding manifests #12048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Web viewer failed to build. | Result | Commit | Link | Manifest | View image diff on kitdiff. Note: This comment is updated whenever you push a commit. |
|
Latest documentation preview deployed successfully.
Note: This comment is updated whenever you push a commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements encoding and decoding of RRD manifests in footers for the Rerun RRD file format. The changes enable random access to chunks within RRD files by storing metadata about chunk locations and properties in footer manifests.
Key changes:
- Adds manifest building during encoding, tracking chunk metadata (offsets, sizes, entity paths, etc.)
- Implements manifest parsing during decoding with transport-to-application conversion
- Adds CLI support for displaying parsed footers (
--footersflag) and recomputing manifests during routing (--recompute-manifestsflag)
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/content/reference/cli.md | Documents new CLI flags for footer display and manifest recomputation |
| crates/top/rerun/src/commands/stdio.rs | Updates stream reading to return RRD manifests alongside messages |
| crates/top/rerun/src/commands/rrd/route.rs | Adds --recompute-manifests flag to enable footer generation during routing |
| crates/top/rerun/src/commands/rrd/print.rs | Implements --footers flag to display parsed footer manifests |
| crates/top/rerun/src/commands/rrd/merge_compact.rs | Updates to handle new return signature with footer metadata |
| crates/top/rerun/src/commands/rrd/filter.rs | Updates to handle new return signature with footer metadata |
| crates/store/re_log_encoding/tests/snapshots/* | Adds test snapshots for manifest data and schema validation |
| crates/store/re_log_encoding/tests/footer_roundtrip.rs | Comprehensive roundtrip test for footer encoding/decoding |
| crates/store/re_log_encoding/src/rrd/log_msg.rs | Implements Encodable/Decodable traits for RrdManifest |
| crates/store/re_log_encoding/src/rrd/encoder.rs | Implements manifest building during message encoding |
| crates/store/re_log_encoding/src/rrd/decoder/stream.rs | Adds rrd_manifests() method to async decoder |
| crates/store/re_log_encoding/src/rrd/decoder/state_machine.rs | Accumulates manifests during decoding and implements manifest extraction |
| crates/store/re_log_encoding/src/rrd/decoder/iterator.rs | Adds rrd_manifests() method to sync decoder iterator |
| crates/store/re_log_encoding/Cargo.toml | Adds insta test dependency |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
964c086 to
c1e6776
Compare
f1ad3fb to
92b97b9
Compare
| } | ||
|
|
||
| #[test] | ||
| fn footer_roundtrip() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arguably the most important part of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(really nice test!)
zehiko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
went through all 3 PRs, it was really nicely documented and easy to follow, from the state machine updated, through rrd manifest builder (which I luckily knew bit about). nice work!
| } | ||
|
|
||
| #[test] | ||
| fn footer_roundtrip() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(really nice test!)
emilk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay!
| // bit weirder on the other hand, but then again this is generally not a new a | ||
| // problem: we tend to perform Sorbet migrations a bit too aggressively all over | ||
| // the place. We really need a layer that sits between the transport and | ||
| // application layer where one can accessed the parsed, unmigrated data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100%
| // Right now, the implemented behavior is that we end up with an empty footer, i.e. there are | ||
| // no manifests in it. | ||
| // Whether that's the correct behavior is another question, but at least it is defined for now | ||
| // and can be changed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love these kind of comments ⭐⭐⭐
|
@rerun-bot full-check |
|
Started a full build: https://github.com/rerun-io/rerun/actions/runs/19887096962 |
|
Looking into why the full-check is not passing anymore :notsure: Oke first problem, why does this code not log fn main() -> Result<(), Box<dyn std::error::Error>> {
let rec = rerun::RecordingStreamBuilder::new("rerun_example_different_data_per_timeline")
.save("diffdata.rrd")?;
rec.set_time_sequence("blue timeline", 0);
rec.set_duration_secs("red timeline", 0.0);
rec.log("points", &rerun::Points2D::new([(0.0, 0.0), (1.0, 1.0)]))?;
// Log a red color on one timeline.
rec.reset_time(); // Clears all set timeline info.
rec.set_duration_secs("red timeline", 1.0);
rec.log(
"points",
&rerun::Points2D::update_fields().with_colors([0xFF0000FF]),
)?;
// And a blue color on the other.
rec.reset_time(); // Clears all set timeline info.
rec.set_time_sequence("blue timeline", 1);
rec.log(
"points",
&rerun::Points2D::update_fields().with_colors([0x0000FFFF]),
)?;
// TODO(#5521): log VisualBounds2D
Ok(())
}TODO: what about C++? Python is the one actually failing, presumably because it's the only one actually trying to log an End message ??
Interestingly, migrating real stuff like Also: do we |
|
@rerun-bot full-check |
|
Started a full build: https://github.com/rerun-io/rerun/actions/runs/19889120354 |
|
Alright, whitespaces in index names was the answer, ofc it was. (Maybe, not confirmed) This unraveled some other unrelated flushing bugs too, it seems. I'll have to look at those in follow ups. |
LLM summary: > This PR introduces the foundational framing infrastructure for RRD stream footers, which will enable richer metadata and improved stream navigation. It establishes the protocol-level types and state machine changes needed to support footers without yet populating them with content. > > Key changes include: > - Addition of `RrdFooter` message types at both transport (protobuf) and application levels > - Introduction of `StreamFooter` frame for locating and validating RRD footers > - Enhanced decoder state machine to handle the new footer frames > - Updated encoder to emit footers with basic state tracking infrastructure To which I would add the following: This introduces all the framing infrastructure so that RRD streams are now capable of carrying footers in all circumstances (including pipes). Specifically, this introduces a couple new types: * `RrdFooter`, a(n empty) protobuf message that will from now on act as the payload for messages of type `MessageKind::End`. * `MessageKind::End` isn't new, it's something that was always there, but until now was always empty. * As always, this has both a cheap transport-level definition and a less cheap app-level definition. * `StreamFooter`, a simple binary frame that mirrors `StreamHeader`, and whose main job is to keep track of where the `RrdFooter`. * This is used for O(1) access to the `RrdFooter`, e.g. when registering data on a Redap-compliant server. For now, these footers are always an empty protobuf messages. We will be filling them up in the following PRs. --- Part of RRD footers series of PRs: * #12044 * #12047 * #12048 * rerun-io/dataplatform#2060 ## TODO * [x] `@rerun-bot full-check` --------- Co-authored-by: Copilot <[email protected]>
eae5d66 to
fd1e2ef
Compare
LLM summary: > This PR introduces RRD manifest support as the second part of the RRD footers series. The manifest provides an index of chunks in an RRD recording, enabling efficient query planning without loading actual data. Key changes include: > - Extending `RrdFooter` and adding `RrdManifest` protobuf definitions > - Implementing `RrdManifest` and `RrdManifestBuilder` to catalog chunks with their metadata (ID, size, offset, timeline ranges) > - Adding transport layer conversions between application-level and protobuf types > - Refactoring schema hash computation to a shared utility method To which I'll add: This PR introduces the core datastructure that this whole footer business is all about: the `RrdManifest`. The `RrdManifest` is a protobuf message (which, as always, comes with both a transport-level and application-level implementations) which carries various important metadata about the associated recording. The most important piece of metadata is the actual manifest record-batch, which lists all the chunks in the recording at at fine-grained enough level of detail that relevancy queries become possible without ever having to load the data in a local chunk-store first. Effectively, you can view record-batch is a dataframe representation of the time panel. There is a lot of code, but it's all fairly trivial and mechanical, bordering on boilerplate in some cases. In fact, what you should really focus on is the fact that this code: https://github.com/rerun-io/rerun/blob/ad754fab94517e80866f7911164a7622b932de74/crates/store/re_log_encoding/tests/footers_and_manifests.rs#L116-L173 yields this manifest record-batch: <img width="1137" height="1468" alt="image" src="https://github.com/user-attachments/assets/c7548fb8-4ea0-47fd-98c6-3783c2095e85" /> with this schema: https://github.com/rerun-io/rerun/blob/ad754fab94517e80866f7911164a7622b932de74/crates/store/re_log_encoding/tests/snapshots/footers_and_manifests__simple_manifest_batch_schema.snap#L5-L117 --- Part of RRD footers series of PRs: * #12044 * #12047 * #12048 * rerun-io/dataplatform#2060 --------- Co-authored-by: Copilot <[email protected]>
180e33e to
6612d8f
Compare
|
Okay the diff is back to looking sane, not sure why git-rebase went completely haywire on this one 😑 |
LLM summary:
To which I actually don't have all that much to add.
This PR is basically all the remaining glue so that, whenever one uses our
Encoderor one of ourDecodervariants, RRD footers and manifests will automagically be computed, injected and serialized/deserialized.The most important part of this PR is arguably the addition of a
footer_roundtriptest, that encodes a recording and then manually decodes all of its chunks directly using the generated RRD manifest, instead of using aDecoder.Part of RRD footers series of PRs: